Europhysics Letters 



PREPRINT 



Maximum matching on random graphs 

Haijun Zhou^^^'^ and Zhong-can Ou-Yang^ 

^ Institute of Theoretical Physics, the Chinese Academy of Sciences, P.O. Box 2735, 
Beijing 100080, China 

^ Interdisciplinary Center of Theoretical Studies, the Chinese Academy of Sciences, 
P.O. Box 2735, Beijing 100080, China 

Max- Planck- Institute of Colloids and Interfaces, 144^4 Potsdam, Cermany 



PACS. 89. 20. -a - Interdisciplinary application of physics. 
PACS. 75.10.Nr - Spin-glass and other random models. 
PACS. 02. 10. Ox - Combinatorics; graph theory. 



Abstract. - The maximum matching problem on random graphs is studied analytically 
by the cavity method of statistical physics. When the average vertex degree c is larger than 
2.7183, groups of max-matching patterns which differ greatly from each other gradually emerge. 
An analytical expression for the max-matching size is also obtained, which agrees well with 
computer simulations. Discussion is made on this continuous glassy phase transition and the 
absence of such a glassy phase in the related minimum vertex covering problem. 



Introduction. - Studies on spin glasses focus on systems with random frustrations |2E1- 
The energy landscape of a spin glass is very rough. When environmental temperature is 
lower than certain critical value, the system gets trapped in one of many local regions of the 
whole configurational space (ergodicity breaking). The deep connection between frustrations 
in spin glasses and constraints in combinatorial optimization problems was noticed by many 
authors, and the replica method developed during the study of spin glass physics |2IS| has 
been applied to hard combinatorial optimization problems including the k-satisfiability 0], 
the number partitioning the Euclidean matching 6_, the vertex covering [J], and many 
others. However, sophistication of replica method renders analytical discussion to be limited 
usually to the replica-symmetric (RS) level. 

Recently, considerable success has been attained in applying the cavity method to 
combinatorial optimization problems The cavity formalism enables analytical 

calculations to be carried out to first-step replica-symmetry-breaking (RSB). For random 3- 
satisfiability (3-Sat) problems it was discovered PO] that, between the Easy-SAT and UNSAT 
phase there is a Hard-SAT phase. The Hard-SAT phase is a glassy phase, with great many 
states of the same ground-state energy density which are separated by very high energy bar- 
riers. The Easy-SAT to Hard-SAT phase-transition is abrupt. Similar behavior was observed 
in random 3-XOR-Sat the other hand, work performed on minimum vertex covering 

(min-covering) of random graphs !13| suggested there is no proliferation of ground-states in 
this model system even when replica symmetry is broken. 
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The following questions arise: (1) Why in the RSB domain, the structural entropy density 
(complexity) of min-covering is zero? (2) If there is a glassy phase-transition in a combinatorial 
optimization problem, is it always discontinuous? In this article, we looked into these questions 
by studying the maximum matching (max-matching) problem of random graphs. We chose 
to work with max-matching because (a) as will be shown later, max-matching is equivalent to 
min-covering in the RS parameter range and, (b) there exist polynomial algorithms to find a 
max-matching, so that theory can be checked by experiment. 

We found that, when the average number of nearest-neighboring vertices of each vertex 
in a random graph, called the average vertex degree c, is larger than Ccr = e the system is 
in a glassy phase with many max-matchings that have very large Hamming distance between 
each other. Contrast to the situations in random 3-Sat and 3-XOR-Sat, these patterns appear 
gradually; their number increases exponentially with c. When c < e, max-matching and 
min-covering are equivalent (there is no edge- redundancy) . When c > e, the appearance of 
redundant edges, while adding great freedom to max-matching, cause severe constraints to 
min-covering patterns, since each edge needs to be covered in the later case. 

We also obtained an analytical expression for the average max-match size, which is in 
agreement with computer simulations in the whole range of parameter c. 

Model and cavity calculations. - Consider a random graph G(n, c/(n — 1)) |14j . There 
are n vertices in the vertex set V = {vi, V2, ■ ■ ■ ,Vn}', between any pair of vertices Vi and vj, 
an edge is present in the edge set E with probability c/(n — 1) and absent with probability 
1 — c/(7i — 1). The average number of edges incident to a randomly chosen vertex is c; and for 
large graph size n, the number k of edges associated with a given vertex obeys the Poisson 
distribution Pp{c, k) = e~'^c^ /k\ A matching M is a subset of the edge set E such that 
no two edges in M share a common vertex. We are interested in the max-matching M*, a 
matching with the largest number of edges. What is the size |Af*| of a max-matching for a 
random graph G(n, c/{n — 1)) constructed by the above mentioned procedure? This question 
could be answered analytically and algorithmically. First, we investigate random graph max- 
matching by zero-temperature statistical physics, and an analytical formula will be given. 
The max-matching patterns could be classified into different groups based on their mutual 
similarity. It turns out that the total number of such groups grow exponentially with graph 
size n when the average vertex degree c > e. 

We associate with each edge(^) Ci of graph G(n, c/(n— 1)) a spin variable S = {0, 1}. There 
are altogether 21-^1 possible microscopic spin configurations for the graph. We introduce the 
following energy functional for each microscopic spin configuration: 



where A (> 1) is a constant and the prime indicates the second summation is restricted to 
edges e, and ej that share a common vertex. For a max-matching M*, we assign Se^ = 1 
to all the edges £ M* and Se^ = to all the other edges ej. The configurational energy 
is — |M*|. Because A > 1, no microscopic spin configurations could attain energy lower than 
— |M*|; therefore, finding the max-matching size is converted to finding the ground energy 
states of eq. Furthermore, each energy (local) minimum configuration corresponds to a 
matching of the graph ll^l . Hereafter we consider only these energy minimum configurations. 
For very large systems (n ^ 1) we group them into different (macroscopic) states. A state 

(^)Notice that, different from ref. 1131 . spins are associated with edges of the graph and cavity fields (see 
below) are associated with vertices of the graph. 




(1) 



Haijun Zhou and Zhong-can Ou-Yang: Maximum matching on random graphs 



3 



of the system contains a set of microscopic spin configurations, all of which have the same 
(minimum) energy and only finite number (or number at most proportional to with 9 < 1) 
of spin flips is needed to transit from one to another of these configurations ^ . 

For each state, say a, we define a quantity haivi) (called the cavity field) for each vertex 
Vi according to the following rule: ha(vi) = —1, if in each microscopic configuration of state 
a one of the edges that meets Vi has spin value S = 1; and ha{vi) = 0, if in one or more 
microscopic configurations of state a all the edges that meet Vi have spin value S = Q. 

A random graph G{n, c/{n — 1)) can be obtained by first generating a random graph 
G(n — 1, c/(n — 1)) and then adding a new vertex (say uq) and setting up fc edges ei, 62, . . . , 
to fc randomly chosen vertices (say wi, W2, . . . , Wfc) of the old graph. The value of fc is governed 
by Pp{c,k) when n is considerably large. If the original system is in state a, the energy 
difference between the enlarged and the original system is 



mm 



fc-1 



=1 i=J+l 



(2) 



For the purpose of favoring microscopic spin configurations that lead to an energy decrease, 
we introduce an "effective inverse temperature" parameter y Denote Pvf\h) as the 

distribution of the cavity field h of vertex Vi over different states a of the graph at given 
effective inverse temperature y. The following recursive equation could be written down for 
this probability distribution: 



p^yHho) = s{k)5{ho) + [1 - s{k)]c'[[[ / dh,piy\h,)]e-y 



^''^6[ho + 9iJ2h^ + k)], (3) 
i=i 

where S{x) is the Kronecker symbol; C is a normalization constant; and 0{x) = 1 if x > and 
= otherwise. Equation JH)) is understood as follows: (1) if vertex vq is isolated, it feels no 
field (the first term); (2) if A; > 1 but in state a each of wq's neighbors has already associated 
with an edge of S" = 1, the edges of vq should all assign S" = to decrease energy (ho = 0); 
(3) otherwise, it is energetically favorable to assign S" = 1 to one of the edges of vertex ■^o 
{ho = !)• Situation (2) and (3) correspond to the second term of eq. 

In deriving eq. it was assumed that, before the addition of vertex vq, the cavity field 
distributions for the vertices ui,f2,. . • are mutually independent [SJI^. This assumption 
is based on the argument that, before vq is added these vertices are usually far apart. It 
certainly does not hold for regular networks. 

A careful inspection of cqs. Q and |j2Jl leads to the following form for the cavity field 
distribution: 

{S{h + 1), probability pi 

S{h), probability p2 

aS{h + 1) + (1 — a)S{h), probability p3 

where < a < 1 is a random number obeying certain distribution; and 

p, = l-e-'P\ P2 = e--(i-Pi), p3 = l-pi-p2. 

We introduce the following transformation for variable a in eq. |0J: a = 1/[1 + 
could be shown that t > is governed by the population dynamics equation 



(4) 



(5) 

y/^]. It 



-y/2 



n[l + r,e-./2]_l 
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m— 1 



^^^n[/dr.p(^)(r.Mr-^^^^) (.-.+00). (7) 



The free energy density at given value of the effective inverse temperature y is obtained 
following the procedure given in ref. jj^l (see also ref. ^3). The expression reads 



'^{y) = -j^i-^ - Pi + 1)2 + cp2 ~ CP1P2) + iVi/y) (1 + cp2)ln(r + e v/"^) 



+ -P3ln(l + T1T2 + Tie-yl'^ + T2e-yl'^) - (1 + c - cpi)ln(l + re-y/"^) (8) 
1 , 

2(-l -P1+P2 + cp2 - CP1P2) 



+ (P3/y)[(l + cp2)lnT+-p^ln(H-TiT2)] (y>l). (9) 

In the above equation and hereafter, an overline means performing an averaging. 

We are interested in the average size of a max-matching for a random graph G(n, c/{n — l)). 
This corresponds to the free energy of the system at j/ — s- 00, provided that the complexity 
or structural entropy density, T,{y) = d^{y)/d{l/y), is non-negative (that is, if there exist 
states at y ^ 00) ^Hl- When the average vertex degree c < e, the only solution of eq. ® is 
Pi — 1 — W{c)/c and p2 = W{c)/c, where W{c) is the root of We^ = c. In this case the 
average size of the max-matching (in units of the graph size n) is 



M* 

x{c)= lim J ^- = l-W{c)/c~W^{c)/2c. (10) 

n — ^00 Ji 

We see the average max-matching size is identical to the average minimum vertex cover size 
obtained in refs. |7ll6j . This should be the case. Actually, for c < e the leaf- removal algorithm 
of ref. |17j will at the same time report a min-covcring and a max-matching. Expression 
is exact for c < e; while for c > e it is just an upper-bound (for c > 4 it even exceeds 1/2). 

When the average degree c > e, eq. ^ has another solution with pi+ P2 < 1, and eq. © 
leads to the following max-matching size expression 

x{c) ^ {1 +pi - P2 - cp2 + cpip2)/2. (11) 

The structural entropy density at effective inverse temperature y = 00 is 



S = p3{l + cp2)lnT + (c/2)p^ln(l + T1T2). (12) 

Based on eqs. Q and (|12l) . the ground-state structural entropy density is calculated nu- 
merically for various values of c and is shown in fig. ^ The structural entropy density is zero 
for c < e and it gradually increases from zero as c > e. Therefore, for c > e there exist 
many degenerate ground states and the system is in a zero-temperature glassy phase. Fig- 
ure ^ also reveals that, the larger the value of c, the larger the value of the structural entropy 
density. This observation is quite different from the case of random 3-Sat and 3-XOR-Sat 
problem |l()ll??j. In random 3-Sat and 3-XOR-Sat problems, at the phase-transition point, 
the complexity jumps from zero to a finite value and then gradually decreases to zero as the 
average vertex degree c increases. 

The "phase-diagram" fig.^is also quite different from that of the minimum vertex covering 
problem ^^l- It is noticeable that, while the minimum vertex covering problem could be 
mapped to an energy functional very similar to eq. ^ in form, in that system there is no 
zero-temperature glassy phase jTS). We will discuss this in the last section. 
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Fig. 1 - The structural entropy density at y = oo [eq. I|12fl ] as a function of average vertex degree 
c. The structural entropy density could exceed In 2, since the total number of configurations is 2^'^^^ 
rather than 2^. 

In fig. [21 the average size of max-matching eq. is shown for various values of c. Also 
shown are the results obtained by an exact algorithm mentioned in the appendix. The ana- 
lytical results are in complete agreement with the experimental result in the whole range of c. 
We suggest that the analytical expressions eq. Hll|) and (|12l) obtained by statistical mechanics 
method are exact in the limit n ^ 1. The scaling of max-matching size for c ^ 1 is 

x{c) - i - e-72 + c'e-'72. (13) 

An interesting question is to ask the probability for a random graph G{n^c/{n — 1)) to 
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Fig. 2 - The average size of the max-matchings for a random graph of fixed average vertex degree 
(solid line) and its asymptotic form (dashed line, see eq. (O). Filled circles are average max-matching 
sizes obtained by a matching algorithm (see appendix). At each given average vertex degree c, in 
the numerical simulation, 1000 samples of a random graph of 10000 vertices were generated and the 
max-matching size for each of them was obtained and then averaged. The errors in the experimental 
data are all smaller than the radius of the circles. Inset is the probability of a random graph of n 
vertices to have a matching containing at least 0.49n edges: n — 10000, filled circles (2000 samples per 
point); n = 5000, empty circles (2000 samples per point); n = 1000, empty triangles (3000 samples 
per point). 
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have a perfect matching, a matching that has n/2 edges. If the random graph contains 
one or more isolated vertices, of cause there could be no perfect matching. The average 
number of isolated vertices in a random graph G{n,c/{n — 1)) is (no) — ne~'^. It must be 
less than unit for a perfect matching to be possible. Therefore, c ^ ln(n). For n = 10^, 
c ~ 9.21. Numerical experiment reveals that, around c ~ 10 there is a sharp change in the 
probability of perfect matching for random graphs of size n = 10^. This is in agreement with 
our above analysis. For c > 10 probability of perfect matching approaches unity while for 
c < 10 it approaches zero. Such a phenomenon was also observed in many other combinatorial 
optimization problems. In the inset of fig. [21 we demonstrate the experimental probability for 
a random graph G'(n, c/(n — 1)) of n = 10* to have a matching containing at least 4900 
edges. We see a sharp transition occurs at c ~ 4.1, as would have been expected according 
to eq. Hll|l . which predicted that at c = 4.1 the average max-matching size is 0.49n. Figure[21 
also shows that the sharpness of this transition is related to system size n, because the relative 
importance of fluctuations in the max-matching size scales as 

Conclusion. - To summarize, we have obtained an analytical expression for the average 
max-matching size of a random graph as a function of the average vertex degree c, and this 
formula was verified by a numerical experiment. The analytical calculation was performed on 
a designed spin model using the cavity method of statistical physics. When c > e the energy 
landscape of this model system has many valleys of the same energy, separated by large 
energy barriers between them. The total number of such low-energy valleys is also estimated. 
Different from the discontinuous glassy phase transition in problems such as random 3-Sat, 
the transition in max-matching is continuous, with a continuously growing structural entropy. 

In a random graph, why there are great many maximum matching patterns but only 
very few minimum vertex-covering patterns [T5I ? We think the reason is edge redundancy. 
When the average vertex degree c > e, redundant edges give additional freedoms to the max- 
matching patterns, but it cause additional constraints to the min-covering patterns, as all 
edges should be covered. As was shown here, when c < e max-matching and min-covering 
are equivalent for random graphs (in this sense, there is no redundant edges). Beyond c = e, 
there may still be deep connections between the solution spaces of these two problems. An 
interesting question is to investigate the possibility of constructing a solution to the min- 
covering problem with the guide of the solutions of the corresponding max-matching problem. 

The excellent agreement between theory and experiment suggests that, for short-range 
spin-glass models on finite connectivity random graphs, the cavity field assumption that cor- 
responds to the first-order replica-symmetry breaking might be accurate enough that no higher 
order RSBs is needed for many purposes. For infinite connectivity SK model 21 ^.nd finite 
connectivity models on random graphs [11121 we now enjoy satisfactory understandings. A 
challenge now is the solution of spin glass models on finite connectivity regular lattices. 

* * * 

We are grateful to Professor YU Lu for support and for revising earlier versions of the 
manuscript. H.Z. acknowledges the hospitality of ITP and ICTS. 

Appendix: matching algorithm. - A max-matching pattern can be found with time 
proportional to polynomials of graph size n I18| . The algorithm mentioned below is inspired 
by the concept of matching-alternating chain Suppose M is a matching of a graph (V, E). 
An incomplete matching-alternating chain (IMAC) of length zero is composed of a vertex that 
does not meet M; an IMAC of length p > 1 is composed of a sequence of distinct vertices 
vo,vi, . . . ,Vp such that (1) {vi, Vi+i} G E for i G {0, ... ,p — 1}; (2) vq does not meet M and 
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Vp meets M; (3) the first, third,. . ., edges do not belong to AI ; and (4) the second, fourth,. . ., 
edges belong to M. Obviously, the length p of an IMAC must be even. For a complete 
matching-alternating chain (CMAC), Vp also should not meet M. The length of a CMAC 
must be odd. 

The algorithm works by inputting a matching M. It either returns a new matching with 
size equaling to |M| + 1 or stops if the input matching is a max-matching. 

(1) . Find all the IMACs of length zero. Set all vertices as unlabeled. 

(2) . If there is no IMACs, stop. Else, select one IMAC (say chain C^^, which is specified 
by its last element Va). Construct a set A = {vi : Vi unlabeled, {va,Vi} £ {E — M)}. 

(3) . If 1^1 7^ 0, select one of its elements (say Wf,) and delete vt from A. If Vf, is already 
in chain /^^, return to step (3). Else, mark vi, as labeled. If V(, does not meet A/, the chain 
formed by adding vi, to the end of Cy^ is a CMAC; jump to (3a). Else, jump to (3b). 

(3a). Denote the set of edges in this CMAC as set C. Denote / = C n Af as the set of 
edges which are shared by C and M. Then AI' = (M — /) U (C — /) is a matching with of 
size |M| + 1. Return Ad'. 

(3b). Suppose Vc is the vertex such that {vb,Vc} G M. Create a new IMAC by adding Vb 
and then Vc to chain Cy^. Return to step (3). 

(4) . Delete chain C^^, and return to step (2). 

If the algorithm returns a new matching, the algorithm is run again with this new matching 
as input, till no new matching is returned. It could proved that the last returned matching is 
a max-matching. 

For each given value of c, we generate 1000 — 3000 instances for a random graph G(n, c/(n — 
1)) of order n 10^. The max-matching size for each of them is obtained by the above- 
mentioned algorithm. The average max-matching sizes are shown in fig. |2] (filled circles). 
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